The whole alignment and nothing but the alignment: the problem of spurious alignment flanks

Authors

  • Martin C. Frith
  • Yonil Park
  • Sergey L. Sheetlin
  • John L. Spouge
Abstract

Pairwise sequence alignment is a ubiquitous tool for inferring the evolution and function of DNA, RNA and protein sequences. It is therefore essential to identify alignments arising by chance alone, i.e. spurious alignments. On one hand, if an entire alignment is spurious, statistical techniques for identifying and eliminating it are well known. On the other hand, if only a part of the alignment is spurious, elimination is much more problematic. In practice, even the sizes and frequencies of spurious subalignments remain unknown. This article shows that some common scoring schemes tend to overextend alignments and generate spurious alignment flanks up to hundreds of base pairs or amino acids in length. For example, in the UCSC genome database, spurious flanks probably comprise >18% of the human-fugu genome alignment. To evaluate the possibility that chance alone generated a particular flank on a particular pairwise alignment, we provide a simple 'overalignment' P-value. The overalignment P-value can identify spurious alignment flanks, thereby eliminating potentially misleading inferences about evolution and function. Moreover, by explicitly demonstrating the tradeoff between over- and under-alignment, our methods guide the rational choice of scoring schemes for various alignment tasks.
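
The overextension effect described in the abstract can be illustrated with a small sketch. It is not the authors' method and does not compute their overalignment P-value. It assumes an ungapped extension into unrelated sequence, uniform random DNA, and two hypothetical match/mismatch schemes (+1/-1 and +1/-3) chosen only for illustration. The function names `spurious_extension` and `mean_random_flank` are likewise made up for this example.

```python
import random


def spurious_extension(a, b, match, mismatch):
    """Return (length, score) of the highest-scoring ungapped prefix of a vs b.

    This mimics how far a local alignment would extend past its true end
    if the flanking sequences were aligned position by position.
    """
    best_len, best_score, score = 0, 0, 0
    for i, (x, y) in enumerate(zip(a, b), start=1):
        score += match if x == y else mismatch
        if score > best_score:  # keep the shortest prefix reaching the maximum
            best_len, best_score = i, score
    return best_len, best_score


def mean_random_flank(match, mismatch, trials=2000, length=200, seed=0):
    """Average spurious extension length over pairs of random DNA sequences."""
    rng = random.Random(seed)  # fixed seed: both schemes see the same sequences
    total = 0
    for _ in range(trials):
        a = rng.choices("ACGT", k=length)
        b = rng.choices("ACGT", k=length)
        total += spurious_extension(a, b, match, mismatch)[0]
    return total / trials


# A lenient scheme extends through the mismatch; a strict one stops before it.
print(spurious_extension("ACGTAC", "ACGAAC", 1, -1))  # (6, 4)
print(spurious_extension("ACGTAC", "ACGAAC", 1, -3))  # (3, 3)

# Mean chance extension into unrelated sequence under each scheme.
print(mean_random_flank(1, -1), mean_random_flank(1, -3))
```

In this toy setting the lenient scheme never extends less far than the strict one on the same sequence pair, which is one side of the tradeoff between over- and under-alignment that the abstract mentions. Real alignment tools also allow gaps and use richer substitution matrices, so the numbers printed here say nothing about any actual genome alignment.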


Similar resources

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...


Effect of Objective Function on the Optimization of Highway Vertical Alignment by Means of Metaheuristic Algorithms

The main purpose of this work is the comparison of several objective functions for optimization of the vertical alignment. To this end, after formulation of optimum vertical alignment problem based on different constraints, the objective function was considered as four forms including: 1) the sum of the absolute value of variance between the vertical alignment and the existing ground; 2) the su...


IT - Business Strategic Alignment and Organizational Agility: The Moderating Role of Environmental Uncertainty

This study investigates the effect of IT-business strategic alignment on organizational agility by considering the effects of IT flexibility and IT capability on strategic alignment. Also this study investigates the moderating role of environmental uncertainty on the relationship between strategic alignment and organizational agility. This research is an applied research based on purpose and de...


An Application of the ABS LX Algorithm to Multiple Sequence Alignment

We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...


OPTIMIZATION OF VERTICAL ALIGNMENT OF HIGHWAYS IN TERMS OF EARTHWORK COST USING COLLIDING BODIES OPTIMIZATION ALGORITHM

One of the most important factors that affects construction costs of highways is the earthwork cost. On the other hand, the earthwork cost strongly depends on the design of vertical alignment or project line. In this study, at first, the problem of vertical alignment optimization was formulated. To this end, station, elevation and vertical curve length in case of each point of vertical intersec...



Volume 36

Publication date: 2008